Create statistics on completion of upper secondary school
This script can be used as a starting point for creating statistics on completion of upper secondary education. Note that the results generated by the script cannot be considered official statistics without further quality control and processing of the data.
The script is based on input from some of our users and attempts to reproduce figures on the completion rate for students in upper secondary education, more specifically Statistics Norway's StatBank table 12971: https://www.ssb.no/statbank/table/12971/tableViewLayout1/
The StatBank table in question covers students who started upper secondary school in autumn 2017 and follows them until the end of the spring semester 2023. A measurement period of five years is used for general education and six years for vocational education. The table shows completion within standard time, along with several subcategories for those who do not complete within standard time. The official figures are:
- Total (64,392)
- Completed with university admissions certification or vocational competence within standard time (46,310)
- Completed with university admissions certification or vocational competence in more than standard time (6,649)
- Completed planned basic competence within five/six years (1,648)
- Still in upper secondary education after five/six years, without having completed (1,723)
- Completed Vg3 or sat the trade/journeyman's examination, but did not pass (1,856)
- Dropped out along the way (6,206)
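As a quick sanity check on the figures above, the six subcategories can be summed and compared with the total. The sketch below is plain Python and is not part of the microdata.no script; the short labels and variable names are chosen here for illustration only.

```python
# Figures copied from the list above (cohort starting autumn 2017).
total = 64392
categories = {
    "completed within standard time": 46310,
    "completed in more than standard time": 6649,
    "completed planned basic competence": 1648,
    "still in upper secondary education": 1723,
    "did not pass": 1856,
    "dropped out": 6206,
}

# The subcategories should add up to the total if they are mutually exclusive.
category_sum = sum(categories.values())
print("Sum of subcategories:", category_sum, "matches total:", category_sum == total)

# Share of the cohort in each category, in percent with one decimal.
shares = {label: round(100 * n / total, 1) for label, n in categories.items()}
for label, share in shares.items():
    print(f"{label}: {share} %")
```

The subcategories add up exactly to the total, which confirms that they are mutually exclusive and cover the whole cohort.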
The biggest challenge is to identify "standard time" for students in vocational education, since this varies greatly depending on the type of subject taken.
About standard time (source Statistics Norway):
Standard time: Standard time means the time in which a program area in upper secondary education must be completed in accordance with the curriculum for a full-time student. Standard time normally corresponds to three years' training for pupils and four years' training for apprentices (two years in school and two years in a company).
Certain program areas have an apprenticeship of two and a half years, and in some cases three years, combined with two years in school. Others have a standard time of five years: three years in school combined with two years of apprenticeship. The latter applies in particular to the electrical studies programme, but also to building and construction engineering and to design and craftsmanship. To take into account the different standard lengths within vocational education programmes, the standard apprenticeship periods from VIGO's code database are used.
In the script below, we make some simplifications related to "standard time". As a result, the figures are not fully comparable with the official figures from Statistics Norway. To make them more comparable, one would need a more detailed assessment of the standard time that applies to each vocational programme.
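To see how the standard-time variants described above translate into the semester counts the script works with, they can be converted as in the following Python sketch. It assumes two semesters per year; the function name `standard_semesters` and the variant labels are hypothetical and only serve the illustration.

```python
SEMESTERS_PER_YEAR = 2  # assumption: one autumn and one spring semester per year


def standard_semesters(school_years, apprenticeship_years=0.0):
    """Convert years in school plus years in apprenticeship to semesters."""
    return int(round((school_years + apprenticeship_years) * SEMESTERS_PER_YEAR))


# Variants mentioned in the description of standard time.
variants = {
    "general, 3 years in school": standard_semesters(3),
    "vocational, 2 years school + 2 years apprenticeship": standard_semesters(2, 2),
    "vocational, 2 years school + 2.5 years apprenticeship": standard_semesters(2, 2.5),
    "vocational, 2 years school + 3 years apprenticeship": standard_semesters(2, 3),
    "vocational, 3 years school + 2 years apprenticeship": standard_semesters(3, 2),
}

for label, semesters in variants.items():
    print(f"{label}: {semesters} semesters")
```

The script below uses 6 semesters as the cut-off for general education and 10 semesters for vocational education. The vocational cut-off therefore corresponds to the longest variant, so students on a programme with a shorter standard time can be counted as on time even when they have used more than their own standard time.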
Another challenge is to recreate all the figures for the group that does not complete within the measurement period. The script does not quite achieve this, but it produces other figures that may be of interest. Among other things, course data is used to check whether students who do not finish within standard time are still studying after the end of the measurement period. This gives an indication of whether a student has dropped out of upper secondary school, is taking a gap year, or is still in education.
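The status rules that the script below applies can be sketched in Python to make the order of evaluation explicit. This is an illustration only and not part of the microdata.no script: the function name `classify` is hypothetical, and the sketch assumes that a comparison against a missing value (here `None`) evaluates to false.

```python
from typing import Optional


def classify(sem_general: Optional[int], sem_vocational: Optional[int],
             in_education: bool, highest_level: Optional[int]) -> Optional[int]:
    """Return status 1, 2, 4, 5 or 6, or None when no rule applies."""
    status = None

    # Status 1: completed within the simplified standard time.
    if (sem_general is not None and sem_general <= 6) or \
            (sem_vocational is not None and sem_vocational <= 10):
        status = 1

    # Status 2 is applied afterwards and overrides status 1, as in the script.
    if (sem_general is not None and sem_general > 6) or \
            (sem_vocational is not None and sem_vocational > 10):
        status = 2

    # Statuses 4 to 6 apply only to students who have not completed.
    if status is None:
        if in_education and highest_level is not None and 3 <= highest_level <= 5:
            status = 4  # still in education at upper secondary level
        elif in_education and highest_level is not None and 6 <= highest_level <= 7:
            status = 5  # in education at higher education level
        elif not in_education:
            status = 6  # no registered education after the measurement period
    return status


print(classify(6, None, False, None))    # completed general education on time
print(classify(None, None, True, 4))     # not completed, still in education
print(classify(None, None, False, None)) # not completed, not in education
```

Status 3 (completed basic competence) is left out here because the script does not implement it either.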
require no.ssb.fdb:34 as db
create-dataset vgstudents
import db/NUDB_AAR_NY_I_VID_UTD_LOV as first_time_reg_highschool
keep if first_time_reg_highschool == 201708
import db/NUDB_AAR_FORSTE_FULLF_VS_LOV as first_time_completed_highschool
import db/NUDB_SEMESTER_FFF_VS_LOV as num_semesters_highschool
import db/NUDB_AAR_FORSTE_FULLF_VSA_LOV as first_time_completed_general
import db/NUDB_AAR_FORSTE_FULLF_VSY_LOV as first_time_completed_vocational
import db/NUDB_SEMESTER_FFF_VSA_LOV as num_semesters_general
import db/NUDB_SEMESTER_FFF_VSY_LOV as num_semesters_vocational
tabulate num_semesters_highschool, missing
tabulate first_time_completed_highschool, missing
tabulate num_semesters_general, missing
tabulate first_time_completed_general, missing
tabulate num_semesters_vocational, missing
tabulate first_time_completed_vocational, missing
//Connecting course data to check if some are still studying after the measurement period
create-dataset coursedata
import db/NUDB_KURS_NUS 2023-09-01 as course_type
import db/NUDB_KURS_KOMP 2023-09-01 as highschool_competence
import db/NUDB_KURS_FNR as personid
destring course_type highschool_competence
collapse(count) course_type -> num_courses (max) course_type -> highest_level (max) highschool_competence, by(personid)
merge num_courses highest_level highschool_competence into vgstudents
use vgstudents
generate in_education = num_courses > 0
summarize highest_level
//Keep only the first digit of the six-digit NUS code, which indicates the education level
replace highest_level = int(highest_level/100000)
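//Creating mutually exclusive statuses for completion of high school
//Simplification: standard time is set to 6 semesters for general and 10 semesters for vocational education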
generate status = 1 if num_semesters_general <= 6 | num_semesters_vocational <= 10
replace status = 2 if num_semesters_general > 6 | num_semesters_vocational > 10
//replace status = 3 if .... //completed basic competence within the timeframe of 5-6 years
replace status = 4 if !inlist(status,1,2) & in_education & inrange(highest_level,3,5)
replace status = 5 if !inlist(status,1,2) & in_education & inrange(highest_level,6,7)
replace status = 6 if !inlist(status,1,2) & !in_education
define-labels statuslbl 1 'Completed on standard time' 2 'Completed on more than standard time' 3 'Completed basic competence within 5-6 years' 4 'Not completed within the timeframe, but still in education at high school level' 5 'Not completed within the timeframe, but still in education at higher education level' 6 'Not completed, and discontinued education'
assign-labels status statuslbl
tabulate status, missing
tabulate highschool_competence status, missing
textblock
Comments on the numbers:
- 1 - Completed on standard time: 47,968 (script) vs 46,310 (SSB). The deviation is due to variation in what is considered standard time, especially within vocational subjects
- 2 - Completed on more than standard time: 5,732 (script) vs 6,649 (SSB). The deviation has the same cause as above
- 3 - Completed basic competence: difficult to find numbers on this in microdata.no
- 4 - Not completed within the timeframe, but still in education at high school level: 1,911, based on course data measured as of September 1, 2023. The SSB numbers show that 1,723 do not complete within the timeframe and continue with high school education. Our numbers match these quite well. Of the 1,911 with status = 4, 1,032 have the competence code 2, i.e. "Vocational competence documented with a journeyman's certificate or certificate of apprenticeship, normal apprenticeship after two years in school"
- 5 - Not completed within the timeframe, but still in education at higher education level: 517, based on course data measured as of September 1, 2023. Most of these have presumably completed high school and therefore belong to status = 1 or 2, so the numbers there should be adjusted accordingly
- 6 - Not completed within the timeframe, and discontinued education: 10,234. According to the SSB numbers, 6,206 of these have dropped out along the way, which suggests that the remaining 4,028 are taking a break from their education
endblock
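The deviations discussed in the comments can be computed directly from the quoted figures. The following Python sketch is separate from the microdata.no script; the dictionary layout and variable names are chosen here for illustration.

```python
# (script figure, official SSB figure) for the statuses where both are quoted.
comparison = {
    "completed on standard time": (47968, 46310),
    "completed on more than standard time": (5732, 6649),
    "still in education at high school level": (1911, 1723),
}

# Absolute and relative deviation of the script figure from the official one.
deviations = {}
for label, (script_n, official_n) in comparison.items():
    diff = script_n - official_n
    deviations[label] = diff
    print(f"{label}: {diff:+d} ({100 * diff / official_n:+.1f} % of the official figure)")

# Status 6: students not in education, minus those SSB counts as dropped out.
not_in_education = 10234
dropped_out_official = 6206
remaining = not_in_education - dropped_out_official
print("Not in education but not counted as dropped out:", remaining)
```

The script overshoots the on-time group and undershoots the delayed group, which is consistent with the generous 10-semester cut-off used for vocational education.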